Decoder

ABSTRACT

A decoder for decoding an encoded audio signal from a first part of the encoded audio signal, wherein the decoder is configured to: receive a first part of an encoded audio signal; determine at least one scaling factor dependent on the first part of the encoded audio signal; scale the first part of the encoded audio signal dependent on the at least one scaling factor to produce a scaled encoded audio signal; and decode the scaled encoded audio signal.

FIELD OF THE INVENTION

The present invention relates to coding, and in particular, but not exclusively, to speech or audio coding.

BACKGROUND OF THE INVENTION

Audio signals, such as speech or music, are encoded for example in order to enable an efficient transmission or storage of audio signals.

Audio codecs (encoders and decoders) are used to represent audio based signals, such as music and background noise. These codecs typically do not utilise a speech model during their coding process; instead they tend to use more generic methods which are suited to representing most types of audio signals, including speech. Speech codecs, by contrast, are usually optimised for speech signals, and can often operate at a fixed bit rate and sampling rate.

Audio codecs can be configured to operate with varying bit rates over a wide range of sampling frequencies, and this is very often the preferred mode of operation for many audio codecs such as the Advanced Audio Coding (AAC) codec. Details of AAC can be found in the ISO/IEC 14496-3 Subpart 4 General Audio Coding (GA) technical specification. At lower bit rates, such audio codecs may work with speech or audio signals at a coding rate equivalent to a pure speech codec. In such circumstances, for speech at least, the speech codec will outperform a pure audio codec in terms of quality. This is due mainly to the utilisation by many speech codecs of the vocal tract model. However, at higher bit rates the performance of an audio codec may be good with any class of audio signal including music, background noise and speech.

A further audio coding option is an embedded variable rate speech or audio coding scheme, which is also referred to as a layered or scalable coding scheme. Embedded variable rate audio or speech coding denotes an audio or speech coding scheme in which a bit stream resulting from the coding operation is distributed into successive layers. A base or core layer, which comprises primary coded data generated by a core encoder, is formed of the binary elements essential for the decoding of the binary stream, and determines a minimum quality of decoding. Subsequent layers make it possible to progressively improve the quality of the signal arising from the decoding operation, where each new layer brings new information. One of the particular features of layered coding is the possibility of intervening at any level of the transmission or storage chain, so as to delete a part of the binary stream without having to include any particular indication to the decoder. The decoder uses the binary information that it receives and produces a signal of corresponding quality. For instance, International Telecommunication Union Telecommunication Standardization Sector (ITU-T) standardisation aims at a wideband codec of 50 to 7000 Hz with bit rates from 8 to 32 kbps. The codec core layer will work at either 8 kbps or 12 kbps, and additional layers with quite small granularity will increase the observed speech and audio quality. The proposed layers will have as a minimum target at least five bit rates of 8, 12, 16, 24 and 32 kbps available from the same embedded bit stream.

By the very nature of layered, or scalable, coding schemes the structure of the codecs tends to be hierarchical in form, consisting of multiple coding stages. Typically different coding techniques are used for the core (or base) layer and the additional layers. The coding methods used in the additional layers are then used either to code those parts of the signal which have not been coded by previous layers, or to code a residual signal from the previous stage. The residual signal is formed by subtracting a synthetic signal, i.e. a signal generated as a result of the previous stage, from the original.

Typically techniques used for low bit rate coding do not perform well at higher bit rates and vice versa. By adopting this hierarchical approach, a combination of coding methods makes it possible to reduce the output to relatively low bit rates while retaining sufficient quality, whilst also producing good quality audio reproduction by using higher bit rates. This has resulted in structures using two different coding technologies. The codec core layer is typically a speech codec based on the Code Excited Linear Prediction (CELP) algorithm or a variant such as adaptive multi-rate (AMR) CELP and variable multi-rate (VMR) CELP.

Details of the AMR codec can be found in the 3GPP TS 26.090 technical specification, the AMR-WB codec in the 3GPP TS 26.190 technical specification, and the AMR-WB+ codec in the 3GPP TS 26.290 technical specification.

A similar scalable audio codec is the VMR-WB codec (Variable Multi-Rate Wide Band), which was developed with regard to the CDMA 2000 communication system.

Details on the VMR-WB codec can be found in the 3GPP2 technical specification C.S0052-0. In a manner similar to the AMR family the source controlled VMR-WB audio codec also uses ACELP coding as a core coder.

The higher layers utilise techniques more akin to audio coding such as time frequency transformations as described in the prior art “Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation” by J. P. Princen et al. (IEEE Transactions on ASSP, Vol ASSP-34, No. 5, October 1986).

However these higher level signals are not optimally coded. For example, the codec described in Ragot et al, “A 8-32 Kbit/s scalable wideband speech and audio coding candidate for ITU-T G.729EV standardisation” published in Acoustics, Speech and Signal Processing 2006, ICASSP 2006 proceedings, 2006 IEEE International Conference Volume 1, page I-1 to I-4 describes scalable wideband audio coding.

A further example of an audio codec is from US patent application published as number 2006/0036435. This application describes an audio codec in which the number of coding bits per frequency parameter is selected dependent on the perceptual importance of the frequency. Thus parameters representing ‘perceptually more important’ frequencies are coded using more bits than the number of bits used to code ‘perceptually less important’ frequency parameters. Typically in an audio signal this means that lower frequencies, which are perceived to be more perceptually important than higher ones, are coded using more bits.

In scalable layered audio codecs of such type it is normal practice to arrange the various coding layers in order of perceptual importance, whereby the bits associated with the quantisation of the perceptually important frequencies, which are typically the lower frequencies, are assigned to a lower and therefore perceptually more important coding layer. Consequently where the channel or storage chain is constrained, the decoder may not receive all coding layers. Therefore the higher coding layers, which are typically associated with the higher frequencies of the coded signal, are not decoded. This has the undesired effect of changing the timbre of the signal by making it perceptually dull in character.

SUMMARY OF THE INVENTION

This invention proceeds from the consideration that coding an audio signal as a number of layers results in the undesirable effect of making the resulting audio signal dull in timbre. This is a consequence of stripping out higher coding layers during the transmission or storage chain, thereby removing the energy present in the higher frequencies.

It would be possible to emphasise any remaining high frequency components, in order to return some of the lost brightness to the timbre of the received audio signal. While this can increase the energy in the higher frequencies, the naturalness of the decoded signal can be compromised to some extent. This approach implies that there is a trade-off between the emphasis of the higher frequencies and the loss of naturalness in the decoded signal.

Embodiments of the present invention aim to address the above problem.

According to the present invention there is provided a decoder for decoding an encoded audio signal from a first part of the encoded audio signal, wherein the decoder is configured to: receive a first part of an encoded audio signal; determine at least one scaling factor dependent on the first part of the encoded audio signal; scale the first part of the encoded audio signal dependent on the at least one scaling factor to produce a scaled encoded audio signal; and decode the scaled encoded audio signal.

The encoded audio signal may comprise at least one set of spectral values, and the first part of the encoded audio signal may comprise: at least one sub-set of spectral values, each sub-set of spectral values associated with one of the at least one set of spectral values; and at least one set scaling factor, each set scaling factor being associated with one of the at least one set of spectral values.

Each of the at least one scaling factor is preferably associated with one of the at least one set of spectral values, wherein the decoder is preferably configured to scale the sub-set of spectral values associated with one of the at least one set of spectral values by the respective scaling factor.

Each scaling factor may comprise a first term dependent on the respective sub-set of spectral values and a second term dependent on the first term and the respective set scaling factor.

The first term of the scaling factor may comprise the total spectral energy value of the respective sub-set of spectral values.

The total spectral energy value of the respective sub-set of spectral values may comprise at least one of: a combination of an absolute value of each spectral value of the respective sub-set of spectral values; and a combination of a squared value of each spectral value of the respective sub-set of spectral values.

Each set scaling factor may comprise at least one of: the average energy per spectral value for the respective set of spectral values; the average energy per spectral value for all sets of spectral values.

The second term may comprise the combination of the first term and the product of the respective set scaling factor and a multiplier.

The decoder is preferably configured to determine the value of the multiplier by subtracting the number of spectral values in the respective sub-set of spectral values from the number of spectral values in the set of spectral values.

The decoder may further be configured to determine the number of spectral values in a set of spectral values; and for each of the spectral values in the set of spectral values the decoder is preferably configured to determine whether the spectral value is within the sub-set of spectral values.

The decoder may be further configured to accumulate the second term by the set scaling factor when the decoder determines that the spectral value is not within the sub-set of spectral values.

The decoder may be further configured to accumulate the first term and the second term by a respective sub-set spectral value when the decoder determines that the spectral value is in the sub-set of spectral values.

Each scaling factor may comprise the first term normalised by the second term.

Each spectral value may comprise a discrete orthogonal transform basis vector weighting coefficient.

The discrete orthogonal transform may comprise a modified discrete cosine transform.

Each scaling factor may comprise the ratio of the first term to the second term.
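
As an illustrative reading of the optional features above, the two terms and the resulting scaling factor for one set of spectral values might be written as follows. The symbols are assumptions introduced here and do not appear in the text: x_k is a received spectral value, R is the received sub-set, N is the number of spectral values in the set, s is the set scaling factor (for example the average energy per spectral value), E_1 and E_2 are the first and second terms, and g is the scaling factor.

```latex
E_1 = \sum_{k \in R} x_k^2, \qquad
E_2 = E_1 + \bigl(N - |R|\bigr)\, s, \qquad
g = \frac{E_1}{E_2}
```

Here the squared-value combination is used for the first term; a combination of absolute values is the other option listed above.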

The received encoded audio signal may comprise individual coding layers.

The at least one scaling factor may be an emphasis scaling factor.

According to a second aspect of the invention there is provided a method for decoding an encoded audio signal from a first part of the encoded audio signal, wherein the method comprises: receiving a first part of an encoded audio signal; determining at least one scaling factor dependent on the first part of the encoded audio signal; scaling the first part of the encoded audio signal dependent on the at least one scaling factor to produce a scaled encoded audio signal; and decoding the scaled encoded audio signal.

The encoded audio signal may comprise at least one set of spectral values, and the first part of the encoded audio signal may comprise: at least one sub-set of spectral values, each sub-set of spectral values associated with one of the at least one set of spectral values; and at least one set scaling factor, each set scaling factor being associated with one of the at least one set of spectral values.

Each of the at least one scaling factor may be associated with one of the at least one set of spectral values, wherein the scaling the first part of the encoded audio signal may comprise scaling the sub-set of spectral values associated with one of the at least one set of spectral values by the respective scaling factor.

Determining at least one scaling factor may comprise determining a first term dependent on the respective sub-set of spectral values and determining a second term dependent on the first term and the respective set scaling factor.

Determining the first term may comprise determining the total spectral energy value of the respective sub-set of spectral values.

Determining the total spectral energy value of the respective sub-set of spectral values may comprise at least one of: determining a combination of an absolute value of each spectral value of the respective sub-set of spectral values; and determining a combination of a squared value of each spectral value of the respective sub-set of spectral values.

Each set scaling factor may comprise at least one of: the average energy per spectral value for the respective set of spectral values; the average energy per spectral value for all sets of spectral values.

Determining the second term may comprise combining the first term and a product of the respective set scaling factor and a multiplier.

The method may further comprise determining the value of the multiplier by subtracting the number of spectral values in the respective sub-set of spectral values from the number of spectral values in the set of spectral values.

The method may further comprise: determining a number of spectral values in a set of spectral values; and for each of the spectral values in the set of spectral values, determining whether the spectral value is within the sub-set of spectral values.

The method may further comprise accumulating the second term by the set scaling factor when the spectral value is determined to not be within the sub-set of spectral values.

The method may further comprise accumulating the first term and the second term by the respective sub-set spectral value when the spectral value is in the sub-set of spectral values.

Determining each scaling factor may comprise normalising the first term by the second term.

Each spectral value is preferably a discrete orthogonal transform basis vector weighting coefficient.

The discrete orthogonal transform is preferably a modified discrete cosine transform.

Determining each scaling factor may comprise the ratio of the first term to the second term.

The received encoded audio signal may comprise individual coding layers.

The at least one scaling factor is preferably an emphasis scaling factor.

According to a third aspect of the invention there is provided an apparatus comprising a decoder as described above.

According to a fourth aspect of the invention there is provided an electronic device comprising a decoder as described above.

According to a fifth aspect of the invention there is provided a computer program product configured to perform a method for decoding an encoded audio signal from a first part of the encoded audio signal, wherein the method comprises: receiving a first part of an encoded audio signal; determining at least one scaling factor dependent on the first part of the encoded audio signal; scaling the first part of the encoded audio signal dependent on the at least one scaling factor to produce a scaled encoded audio signal; and decoding the scaled encoded audio signal.

According to a sixth aspect of the invention there is provided a decoder for decoding an encoded audio signal from a first part of the encoded audio signal, wherein the decoder comprises: means for receiving a first part of an encoded audio signal; means for determining at least one scaling factor dependent on the first part of the encoded audio signal; means for scaling the first part of the encoded audio signal dependent on the at least one scaling factor to produce a scaled encoded audio signal; and means for decoding the scaled encoded audio signal.

BRIEF DESCRIPTION OF DRAWINGS

For better understanding of the present invention, reference will now be made by way of example to the accompanying drawings in which:

FIG. 1 shows schematically an electronic device employing embodiments of the invention;

FIG. 2 shows schematically an audio decoder according to an embodiment of the present invention;

FIG. 3 shows a flow diagram illustrating the operation of an embodiment of the audio decoder according to the present invention;

FIG. 4 shows a flow diagram illustrating part of the operation shown in FIG. 3, according to a first embodiment of the invention; and

FIG. 5 shows a flow diagram illustrating part of the operation shown in FIG. 3, according to a second embodiment of the invention.

DESCRIPTION OF PREFERRED EMBODIMENTS OF THE INVENTION

The following describes in more detail possible codec mechanisms for the provision of adaptive or variable audio codecs. In this regard reference is first made to FIG. 1, which shows a schematic block diagram of an exemplary electronic device 610, which may incorporate a codec according to an embodiment of the invention.

The electronic device 610 may for example be a mobile terminal or user equipment of a wireless communication system.

The electronic device 610 comprises a microphone 611, which is linked via an analogue-to-digital converter 614 to a processor 621. The processor 621 is further linked via a digital-to-analogue converter 632 to loudspeakers 633. The processor 621 is further linked to a transceiver (TX/RX) 613, to a user interface (UI) 615 and to a memory 622.

The processor 621 may be configured to execute various program codes. The implemented program codes 623 comprise an audio encoding code for encoding a lower frequency band of an audio signal and a higher frequency band of an audio signal. The implemented program codes 623 further comprise an audio decoding code. The implemented program codes 623 may be stored for example in the memory 622 for retrieval by the processor 621 whenever needed. The memory 622 could further provide a section 624 for storing data, for example data that has been encoded in accordance with the invention.

The encoding and decoding code may in embodiments of the invention be implemented in hardware or firmware.

The user interface 615 enables a user to input commands to the electronic device 610, for example via a keypad, and/or to obtain information from the electronic device 610, for example via a display. The transceiver 613 enables a communication with other electronic devices, for example via a wireless communication network.

It is to be understood again that the structure of the electronic device 610 could be supplemented and varied in many ways.

A user of the electronic device 610 may use the microphone 611 for inputting speech that is to be transmitted to some other electronic device or that is to be stored in the data section 624 of the memory 622. A corresponding application has been activated to this end by the user via the user interface 615. This application, which may be run by the processor 621, causes the processor 621 to execute the encoding code stored in the memory 622.

The analogue-to-digital converter 614 converts the input analogue audio signal into a digital audio signal and provides the digital audio signal to the processor 621.

The processor 621 may then process the digital audio signal in the same way as described with reference to FIGS. 2 and 3.

The resulting bit stream is provided to the transceiver 613 for transmission to another electronic device. Alternatively, the coded data could be stored in the data section 624 of the memory 622, for instance for a later transmission or for a later presentation by the same electronic device 610.

The electronic device 610 could also receive a bit stream with correspondingly encoded data from another electronic device via its transceiver 613. In this case, the processor 621 may execute the decoding program code stored in the memory 622. The processor 621 decodes the received data, for instance in the same way as described with reference to FIGS. 4 and 5, and provides the decoded data to the digital-to-analogue converter 632. The digital-to-analogue converter 632 converts the digital decoded data into analogue audio data and outputs them via the loudspeakers 633. Execution of the decoding program code could be triggered as well by an application that has been called by the user via the user interface 615.

The received encoded data could also be stored instead of an immediate presentation via the loudspeakers 633 in the data section 624 of the memory 622, for instance for enabling a later presentation or a forwarding to still another electronic device.

It would be appreciated that the schematic structures described in FIG. 2 and the method steps in FIGS. 3 to 5 represent only a part of the operation of a complete audio codec as exemplarily shown implemented in the electronic device shown in FIG. 1. The general operation of audio codecs is known and features of such codecs which do not assist in the understanding of the operation of the invention are not described in detail.

The audio codec of an embodiment of the invention comprises an encoder part, which converts audio signals into encoded signals, and a decoder part, which converts encoded signals into replicas of the audio signal originally coded in the encoder part.

The encoder is not described in detail within the application. However further information on encoders may be found in the co-pending applications [PWF reference 314217/KCS/GJS and 314261/KCS/GJS]. The encoder typically receives the audio signal and encodes the audio signal as a series of layers. The ‘core’ layers typically comprise information related to parameters generated from the core codec. The ‘higher’ layers typically comprise information related to the difference between the original audio signal and a synthesised copy of the audio signal generated by decoding the ‘lower layer’ parameters. The ‘core layers’ and at least some of the ‘higher layers’ are then multiplexed together and passed to the decoder for decoding.

With respect to FIG. 2, an example of a decoder 400 for the codec as implemented in embodiments of the invention is shown. The decoder 400 receives an encoded signal and outputs a replica of the original audio signal.

The decoder comprises a demultiplexer 401, which receives the encoded signal and outputs a series of data streams. The demultiplexer 401 is connected to a core decoder 471 for passing the core level bitstreams (which can be referred to as the R1 and R2 layers in this embodiment).

Although the above embodiments have been described as producing core levels or layers described above as the R1 and R2 layers, it is to be understood that further embodiments may adopt a differing number of core encoding layers, thereby being capable of achieving different levels of granularity in terms of both bit rate and audio quality.

The demultiplexer 401 is also connected to a difference decoder 473 for outputting the higher level bitstreams (which can be referred to as the R3, R4, and R5 layers in this embodiment). The core decoder 471 may be connected to a summing device 413 via a delay element 410 which also receives a synthesized signal.

The higher coding layers (referred to as R3, R4 and/or R5) encode the signal at a progressively higher bit rate and quality level. It is to be understood that further embodiments may adopt a differing number of encoding layers, thereby achieving a different level of granularity in terms of both bit rate and audio quality.

The core decoder may be connected to a synthesized signal decoder (not shown in FIG. 2). The synthesized signal decoder (not shown in FIG. 2) may then be connected to the difference decoder 473 for passing locally generated scaling factors for each sub-band from the core level decoder synthetic signal. These factors typically take the form of an energy measure, including inter alia, root mean square, average energy, peak magnitude. This value may form a scaling factor for a sub-band. However, it is equally likely to be used in conjunction with other values which may be transmitted as part of the encoded bit stream, to form a combined scaling factor.

The difference decoder 473 is also connected to the summing device 413 to pass a difference signal to the summing device. The summing device 413 has an output which is an approximation of the original signal.
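
The role of the summing device 413 can be pictured with a minimal sketch in C. The function name sum_signals and its parameters are hypothetical, and the sketch assumes that the delayed synthesized signal and the decoded difference signal are already time-aligned sample buffers of equal length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds the decoded difference signal to the delayed
 * synthesized signal, sample by sample, to approximate the original. */
void sum_signals(const float *synth_delayed, const float *difference,
                 float *out, size_t n)
{
    assert(synth_delayed != NULL && difference != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = synth_delayed[i] + difference[i];
    }
}
```

If higher layers have been stripped, the difference buffer simply carries less of the high-frequency detail, which is the situation the emphasis scaling is intended to address.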

The demultiplexer 401 receives the encoded signal, shown in FIG. 3 by step 501.

The demultiplexer 401 is further arranged to separate the core level signals (R1 and/or R2) from the higher level signals (R3, R4, and/or R5). This step is shown in FIG. 3 in step 503.

The core level signals are passed to the core decoder 471 and the higher level signals passed to the difference decoder 473.

The core decoder 471, using the core codec 403, receives the core level signal (the core codec encoded parameters) discussed above and is configured to perform a decoding of these parameters to produce an output similar to that produced by a synthesized signal output by a core codec 203 in an encoder.

For embodiments where the scalable coding system's core codec is operating at a lower sampling rate than the original input signal, the encoder may have performed pre-processing on the audio signal prior to the application of the core codec and therefore also performed post-processing on a synthesized signal to return the synthesized signal to the same sample rate as the original audio signal. The synthesized signal may for example be up-sampled by the post processor 405 to produce a synthesized signal similar to the synthesized signal output by the core encoder 271 in the encoder 200.

In embodiments where the scalable layered coding system's core codec is operating at the same sampling rate as the original, the post processing stage may be omitted from the decoder.

This synthesized signal is passed via the delay element 410 to the summing device 413. In embodiments of the invention, where for example the difference decoder performs a scaling or re-ordering dependent on parameters generated from the synthesized signal, the synthesized signal may be then also passed to the difference decoder 473 as shown in FIG. 2 by the dashed connection between the core decoder 471 and the difference decoder 473.

The generation of the synthesized signal step is shown in FIG. 5 by step 505 c.

The difference decoder 473 passes the higher level signals to the difference processor 409.

The difference processor 409 demultiplexes from the higher level signals the received scale factors and the quantized sub-vectors whose constituent components are formed from the scaled frequency coefficients, such as MDCT inter alia.

The difference processor 409 may re-index the received scale factors and the quantized sub-vectors. The re-indexing returns the scale factors and the quantized sub-vectors to the order prior to an indexing carried out in an encoder.

The difference processor 409 may also de-interlace or de-order the sub-vectors according to any de-interlacing or de-ordering process. This process is carried out to return the order of the sub-vectors to the order prior to any interlacing or re-ordering carried out in an encoder.
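
One way to picture the de-ordering is as the inverse of a permutation. The sketch below is only an illustration: the function name deorder_subvectors and the order table are hypothetical, and it assumes the decoder knows, for each transmitted position k, the original position order[k] of that sub-vector.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: order[k] holds the original position of the
 * sub-vector received in slot k. Each sub-vector has dim components. */
void deorder_subvectors(const float *received, const size_t *order,
                        float *restored, size_t num_vect, size_t dim)
{
    for (size_t k = 0; k < num_vect; k++) {
        assert(order[k] < num_vect);
        /* Copy the whole sub-vector back to its original position. */
        memcpy(&restored[order[k] * dim], &received[k * dim],
               dim * sizeof(float));
    }
}
```

The same pattern could be applied to the scale factors by using dim equal to one.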

The re-indexing/de-ordering is shown in FIG. 3 as step 505.

The scaling of the sub-vectors in embodiments of the invention may comprise at least two separate scaling actions.

The difference processor 409 may perform a de-scaling action. The de-scaling of the sub-vectors modifies the values of each of the sub-vectors so that each sub-vector approximates the value of the related sub-vector prior to any encoder scaling.
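
A de-scaling action of this kind might look like the following sketch. It assumes, purely for illustration, that the encoder normalised the coefficients of a sub-band by dividing them by a single factor, so that the decoder restores them by multiplying by the same factor; the name descale_subband is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: undoes an assumed encoder-side division by
 * `scale` for `count` consecutive coefficients, in place. */
void descale_subband(float *coeffs, size_t count, float scale)
{
    assert(coeffs != NULL);
    for (size_t i = 0; i < count; i++) {
        coeffs[i] *= scale;
    }
}
```

If the encoder multiplied rather than divided, the decoder would divide instead; the text leaves the direction open.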

The de-scaling of the sub-vectors is shown in FIG. 3 in step 509.

It would be appreciated that the de-scaling factors may be generated by any method. For example the de-scaling factors may be non time varying predetermined factors, or time varying factors which are passed to the decoder or calculated from information passed with the higher level signal (for example the received scale factors described above). In other embodiments of the invention the de-scaling factors are calculated from the core ‘lower’ layer parameters or from the synthesized signal.

Furthermore it would be appreciated that a de-scaling may comprise any number or combination of de-scaling actions with different factors used in each separate de-scaling action.

The de-scaling action is shown in FIG. 3 by step 511.

The difference processor 409 furthermore performs an emphasis rescaling of the sub-vectors.

In a first embodiment of the invention a single emphasis factor is calculated based on factors representing the ratio of the energy of the original signal to the energy in the reconstructed signal.

In a first embodiment of the invention the energy of the original signal is estimated from the quantized sub-band scale factors. The quantized sub-band scale factors are themselves generated by the difference processor 409 by dequantizing the codebook indices representing the sub-band scale factors. In the same embodiment the energy of the reconstructed signal is estimated from the combined effect of a subset of scale factors whose members are dependent on the MDCT sub-vectors present over the frequency range.

Each MDCT sub-vector index is a reference to a MDCT sub-vector, whose constituent components are frequency components arranged in an ascending order of frequency.

With respect to FIG. 4, an example of the operation of the first embodiment of the invention in a decoder (together with the de-scaling process) is shown in further detail. In this example the sub-vectors are grouped into sub-bands of sub-vectors. In the example below an optional predetermined scaling process is shown where in the encoder each sub-vector within a sub-band, b, has been scaled by the same factor S_(b). The steps associated with this scaling may in other embodiments of the invention be replaced by other de-scaling steps or may be missing from the process.

In the first step 201, the current sub-vector index is checked to see if it is a valid index value, i.e. whether it is below the maximum index value.

If the sub-vector index is not valid the method moves to step 215; otherwise, if the current sub-vector index is valid, the method moves to step 203.

In step 203 the sub-band, b, associated with the index, i, is determined. The scaling factor, S_(b), associated with the sub-band is also determined.

In the following step, step 205, the sub-vector index, i, is compared against a list of received frequency sub-vectors to determine whether the current index is part of the current coding layer; in other words, whether a MDCT sub-vector was received representing the same index or frequency index.

If there is a sub-vector representing the current index, i, the method passes to step 207, else the method passes to step 217.

In step 207, the MDCT sub-vector associated with the current index is recovered. The MDCT sub-vector is then descaled using the scaling factor S_(b).

In the following step, step 209, the sum of the energy of the vectorcomponents is calculated. For example each vector component is squaredand summed to give a mean square energy value for each MDCT sub-vector.

In the following step, step 211, the sum of the energy of the vectorcomponents calculated in step 209 is added to the current running totalenergy value E and the current running energy value for the currentcoding layer E_RxLayer. E may be seen to represent the energy of thefrequency coefficients present in the signal before higher layers werestripped from the bitstream. E_RxLayer may be seen to represent theenergy of the frequency coefficients present in the received codinglayers. It is to be appreciated that E and E_RxLayer may representrespective energy factors calculated over a frequency range which isdetermined by the number of sub-band groups.

In the following step, step 213, the index is incremented and the methodis returned to step 201.

In step 217, which follows step 205 where no MDCT sub-vector was received representing the same index or frequency index, the scale factor for the index S_(b) is squared and added to the current running total energy value E. The method then passes to step 213 where the index is incremented and the method returns to step 201.

In step 215, which follows step 201 determining that the index, i, is not a valid index (i.e. the index has reached its maximum value), the method calculates the emphasis factor. In the first embodiment of the invention this emphasis factor is the square root of the ratio of the total energy E to the energy value of the coding layer E_RxLayer.

This emphasis factor is then applied to those constituent components of the MDCT sub-vectors over which the factor is calculated.

This method may be written as part of C programming language code such as that shown below. In this instance the emphasis factor is calculated for the energy of the frequency coefficients received in the R4 layer.

#include <math.h> /* sqrtf */

/* MAX_NO_VECT, SPACE_DIM, BANDS, band_bin, find_subband() and vec_mul_s()
   are assumed to be defined elsewhere in the codec. */

void enhance_hf(float * y_norm,  /* decoded scaled MDCT coefficients for a frame */
                int start,       /* first sub vector to be applied the enhancement */
                float * scales,  /* quantized additional scales for subbands */
                int * read_vect) /* flags indicating which sub vectors have been read in R4 layer */
{
  float energy = 0.0,    /* approximation of the energy of the original signal */
        energy_R4 = 0.0; /* approximation of the energy of the reconstructed signal */
  float factor;
  int i, b;

  for (i = start; i < MAX_NO_VECT; i++) /* MAX_NO_VECT is the maximum number of sub vectors */
  {
    /* find to which subband the sub vector i belongs to */
    b = find_subband(band_bin, BANDS, i*SPACE_DIM);
    /* if the sub vector is read in R4 */
    if (read_vect[i] == 1)
    {
      energy += scales[b]*scales[b];
      energy_R4 += scales[b]*scales[b];
    }
    else /* update only the energy corresponding to the original signal */
      energy += scales[b]*scales[b];
  }

  /* no energy received in R4: nothing to enhance, and the division below is avoided */
  if (energy_R4 <= 0.0)
    return;

  factor = 0.7*sqrtf(energy/energy_R4);
  for (i = start; i < MAX_NO_VECT; i++)
  {
    if (read_vect[i] == 1)
      /* multiply the reconstructed sub vector in order to increase its energy */
      vec_mul_s(&y_norm[i*SPACE_DIM], factor, &y_norm[i*SPACE_DIM], SPACE_DIM);
  }
  return;
}

In further embodiments of the invention the emphasis factor described above may be modified by a further multiplication factor. This factor may be subjectively chosen or may be chosen by the difference processor to ‘tune’ the audio decoded signal. Typically the further multiplication factor is a value less than 1.

With regards to FIG. 5, an example of the operation of a second embodiment of the invention in a decoder (together with the de-scaling process) is shown in further detail. In this example the sub-vectors are also grouped into sub-bands of sub-vectors. In the example below an optional predetermined scaling process is shown where in the encoder each sub-vector within a sub-band, b, has been scaled by the same factor S_(b). The steps associated with this scaling may in other embodiments of the invention be replaced by other de-scaling steps or may be missing from the process.

In the following example both a sub-band index, b, and a sub-vector index, i, are defined. The sub-vector index may be independent but capable of being mapped to the sub-band index or may be a sub-division of the sub-band index.

In the first step 301, the current sub-band index, b, is checked to see if it is a valid index value, i.e. whether it is less than or equal to the maximum sub-band index value.

If the sub-band index is not valid the method moves to step 321 and the method ends; otherwise, if the current sub-band index is valid, the method moves to step 303.

In step 303 the scaling factor, S_(b), associated with the sub-band index, b, is determined.

In the following step, step 305, the sub-vector index, i, is compared against a list of received frequency sub-vectors to determine whether the current sub-vector index is part of the current coding layer, in other words whether an MDCT sub-vector was received representing the same sub-vector index.

If there is an MDCT sub-vector representing the current index, i, the method passes to step 307, else the method passes to step 319.

In step 307, the MDCT sub-vector associated with the current sub-vector index is recovered. The MDCT sub-vector is then descaled using the scaling factor S_(b).

In the following step, step 309, the sum of the energy of the vector components is calculated. For example each vector component is squared and summed to give a mean square energy value for each MDCT sub-vector.

In the following step, step 311, the sum of the energy of the vector components calculated in step 309 is added to the current running total energy value E and the current running energy value for the current coding layer E_RxLayer.

In the following step, step 313, the sub-vector index is incremented. Furthermore the incremented sub-vector index, i, is checked to determine whether the sub-vector is within the current sub-band index, b. If the incremented sub-vector is in the current sub-band the method passes to step 305; if not, the method passes to step 315.

In step 319, which follows step 305 where no MDCT sub-vector was received representing the same index or frequency index, the scale factor for the index S_(b) is squared and added to the current running total energy value E. The method then passes to step 313 where the sub-vector index is incremented and checked to determine whether the sub-vector is within the current sub-band index, b.

In step 315, which follows step 313, the method calculates the emphasis factor for the current sub-band, b. In one embodiment of the invention this emphasis factor is the square root of the ratio of the total energy E to the energy value of the coding layer E_RxLayer.

This emphasis factor is then applied to all of the MDCT sub-vectors within the sub-band.

In step 317, the following step, the method increments the sub-band index b and returns to step 301, where the method checks whether there are any more sub-bands to process.

This method for the exemplary embodiment may also be represented in C by the following programming code. In this instance the emphasis factor is calculated for the energy of the frequency coefficients received in the R4 layer.

#include <math.h> /* sqrtf */

/* MAX_NO_VECT, SPACE_DIM, BANDS, band_bin, find_subband(), vec_set() and
   vec_mul_s() are assumed to be defined elsewhere in the codec. */

void enhance_hf(float * y_norm,  /* decoded scaled MDCT coefficients for a frame */
                int start,       /* first sub vector to be applied the enhancement */
                float * scales,  /* quantized additional scales for subbands */
                int * read_vect) /* flags indicating which sub vectors have been read in R4 layer */
{
  float energy[BANDS],    /* approximation of the energy of the original signal */
        energy_R4[BANDS]; /* approximation of the energy of the reconstructed signal */
  float factor[BANDS];
  int i, b;

  /* initializations to 0 */
  vec_set(energy, 0.0, BANDS);
  vec_set(energy_R4, 0.0, BANDS);
  /* initializations to 1.0 */
  vec_set(factor, 1.0, BANDS);

  for (i = start; i < MAX_NO_VECT; i++) /* MAX_NO_VECT is the maximum number of sub vectors */
  {
    /* find to which subband the sub vector i belongs to */
    b = find_subband(band_bin, BANDS, i*SPACE_DIM);
    /* if the sub vector is read in R4 */
    if (read_vect[i] == 1)
    {
      energy[b] += scales[b]*scales[b];
      energy_R4[b] += scales[b]*scales[b];
    }
    else /* update only the energy corresponding to the original signal */
      energy[b] += scales[b]*scales[b];
  }

  /* one factor per subband (loop over b, not i); bands with nothing received keep 1.0 */
  for (b = 0; b < BANDS; b++)
    if (energy_R4[b] > 0.0)
      factor[b] = 0.7*sqrtf(energy[b]/energy_R4[b]);

  for (i = start; i < MAX_NO_VECT; i++)
  {
    if (read_vect[i] == 1)
    {
      /* find to which subband the sub vector i belongs to */
      b = find_subband(band_bin, BANDS, i*SPACE_DIM);
      /* multiply the reconstructed sub vector in order to increase its energy */
      vec_mul_s(&y_norm[i*SPACE_DIM], factor[b], &y_norm[i*SPACE_DIM], SPACE_DIM);
    }
  }
  return;
}

This second embodiment is particularly advantageous as it is able to provide emphasis for each sub-band separately. In embodiments of the invention where sub-vectors are interlaced the reduction of the higher level signals would result in at least some information for each of the sub-bands being received and thus a wider bandwidth of difference signals being reconstructed.

However even where interlacing is not employed the embodiments described above are advantageous as they are able to at least partially mitigate the lost energy information by emphasising the values of any remaining MDCT sub-vectors. Such embodiments as described in relation to the invention may therefore accentuate the received higher frequencies by a scaling factor to an extent such that the scaling factor used is related to the energy difference between the original signal spectrum and the received signal spectrum.

The embodiments shown above show a method for calculating the emphasis factor for each sub-band on a vector by vector basis. However as would be appreciated by the person skilled in the art other methods of calculation of the emphasis factor may be employed. For example the E_RxLayer value may be calculated as shown above when the index is part of the coding layer. The E value may then be calculated by taking the E_RxLayer value and adding to it the product of the S_(b)² value and the number of times that the sub-vector index is not part of the coding layer.
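The alternative calculation can be sketched as a single expression for one sub-band, under the assumption that the number of received sub-vectors is known. The function name total_energy_estimate and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: E = E_RxLayer + S_b^2 * (number of sub-vector indices
 * that are not part of the coding layer). */
float total_energy_estimate(float e_rx_layer, float s_b,
                            size_t n_total, size_t n_received)
{
    assert(n_received <= n_total);
    size_t n_missing = n_total - n_received;
    return e_rx_layer + s_b * s_b * (float)n_missing;
}
```

For example, with E_RxLayer equal to 4, a scale factor of 2 and three of five sub-vectors missing, the estimate of E is 4 + 4 × 3 = 16.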

This emphasis process is shown in FIG. 3 in step 513.

The output from the emphasis process is then passed to an inverse MDCT processor 411 which outputs a time domain sampled version of the difference signal.

This inverse MDCT process is shown in FIG. 5 as step 515.

The time domain sampled version of the difference signal is then passed from the difference decoder 473 to the summing device 413 which, in combination with the delayed synthesized signal from the coder decoder 471 via the digital delay 410, produces a copy of the original digitally sampled audio signal.

This combination is shown in FIG. 5 by the step 517.
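As a rough illustration of the delay and summation, the sketch below adds a frame of the difference signal to the synthesized signal delayed by a fixed number of samples. The delay length DELAY, the delay_state structure and the function name combine_frame are assumptions made for the example; the description does not specify them.

```c
#include <assert.h>
#include <stddef.h>

#define DELAY 2 /* assumed delay in samples; not taken from the description */

/* Holds the last DELAY synthesized samples of the previous frame. */
typedef struct {
    float hist[DELAY];
} delay_state;

/* Hypothetical sketch: out[i] = synth delayed by DELAY samples + diff[i].
 * out must not alias synth, and each frame must hold at least DELAY samples. */
void combine_frame(delay_state *st, const float *synth, const float *diff,
                   float *out, size_t n)
{
    assert(st && synth && diff && out && out != synth && n >= DELAY);

    for (size_t i = 0; i < n; i++) {
        /* the first DELAY outputs use samples carried over from the last frame */
        float delayed = (i < DELAY) ? st->hist[i] : synth[i - DELAY];
        out[i] = delayed + diff[i];
    }

    /* keep the tail of this frame for the start of the next one */
    for (size_t i = 0; i < DELAY; i++)
        st->hist[i] = synth[n - DELAY + i];
}
```

The state starts at zero, so the first DELAY output samples of the first frame contain only the difference signal.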

The above describes a procedure using the example of a VMR audio codec. However, similar principles can be applied to any other multi-rate speech or audio codec.

In the example provided above of the present invention the MDCT (and IMDCT) is used to convert the signal from the time to frequency domain (and vice versa). As would be appreciated any other appropriate time to frequency domain transform with an appropriate inverse transform may be implemented instead. Thus any orthogonal discrete transform may be implemented. Non-limiting examples of other transforms comprise: a discrete Fourier transform (DFT), a fast Fourier transform (FFT), a discrete cosine transform (DCT-I, DCT-II, DCT-III, DCT-IV etc), and a discrete sine transform (DST).

The embodiments of the invention described above describe the codec 10 in terms of a decoder 400 apparatus separate from an encoder in order to assist the understanding of the processes involved. However, it would be appreciated that the apparatus, structures and operations may be implemented as a single encoder-decoder apparatus/structure/operation. Furthermore in some embodiments of the invention the coder and decoder may share some or all common elements.

Although the above examples describe embodiments of the invention operating within a codec within an electronic device 610, it would be appreciated that the invention as described below may be implemented as part of any variable rate/adaptive rate audio (or speech) codec where the difference signal (between a synthesized and real audio signal) may be quantized. Thus, for example, embodiments of the invention may be implemented in an audio codec which may implement audio coding over fixed or wired communication paths.

Thus user equipment may comprise an audio codec such as those described in embodiments of the invention above.

It shall be appreciated that the term user equipment is intended to cover any suitable type of wireless user equipment, such as mobile telephones, portable data processing devices or portable web browsers.

Furthermore elements of a public land mobile network (PLMN) may also comprise audio codecs as described above.

In general, the various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. For example, some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto. While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.

The embodiments of this invention may be implemented by computer software executable by a data processor of the mobile device, such as in the processor entity, or by hardware, or by a combination of software and hardware. Further in this regard it should be noted that any blocks of the logic flow as in the Figures may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions.

The memory may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory. The data processors may be of any type suitable to the local technical environment, and may include one or more of general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs) and processors based on multi-core processor architecture, as non-limiting examples.

Embodiments of the inventions may be practiced in various components such as integrated circuit modules. The design of integrated circuits is by and large a highly automated process. Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.

Programs, such as those provided by Synopsys, Inc. of Mountain View, Calif. and Cadence Design, of San Jose, Calif. automatically route conductors and locate components on a semiconductor chip using well established rules of design as well as libraries of pre-stored design modules. Once the design for a semiconductor circuit has been completed, the resultant design, in a standardized electronic format (e.g., Opus, GDSII, or the like) may be transmitted to a semiconductor fabrication facility or “fab” for fabrication.

The foregoing description has provided by way of exemplary and non-limiting examples a full and informative description of the exemplary embodiment of this invention. However, various modifications and adaptations may become apparent to those skilled in the relevant arts in view of the foregoing description, when read in conjunction with the accompanying drawings and the appended claims. However, all such and similar modifications of the teachings of this invention will still fall within the scope of this invention as defined in the appended claims.

1. An apparatus comprising: a decoder configured to: receive a first part of an encoded audio signal; determine at least one scaling factor based at least in part on the first part of the encoded audio signal; scale the first part of the encoded audio signal based at least in part on the at least one scaling factor to produce a scaled encoded audio signal; and decode the scaled encoded audio signal.
2. The apparatus as claimed in claim 1, wherein the encoded audio signal comprises at least one set of spectral values, and the first part of the encoded audio signal comprises: at least one sub-set of spectral values, each sub-set of spectral values associated with one of the at least one set of spectral values; and at least one set scaling factor, each set scaling factor being associated with one of the at least one set of spectral values.
3. The apparatus as claimed in claim 2, wherein each of the at least one scaling factor is associated with one of the at least one set of spectral values, wherein the decoder is configured to scale the sub-set of spectral values associated with one of the at least one set of spectral values by the respective scaling factor.
4. (canceled)
5. The apparatus as claimed in claim 3, wherein the first term of the scaling factor comprises the total spectral energy value of the respective sub-set of spectral values; and wherein the total spectral energy value of the respective sub-set of spectral values comprises at least one of: a combination of an absolute value of each spectral value of the respective sub-set of spectral values; and a combination of a squared value of each spectral value of the respective sub-set of spectral values.
6. (canceled)
7. A decoder as claimed in claim 5, wherein each set scaling factor comprises at least one of: the average energy per spectral value for the respective set of spectral values; the average energy per spectral value for all sets of spectral values.
8. The apparatus as claimed in claim 5, wherein the second term comprises the combination of the first term and the product of the respective set scaling factor and a multiplier, and wherein the decoder is further configured to determine the value of the multiplier by subtracting the number of spectral values in the respective sub-set of spectral values from the number of spectral values in the set of spectral values.
9. (canceled)
10. The apparatus as claimed in claim 3, further configured to determine the number of spectral values in a set of spectral values; and for each of the number of spectral values in a set of spectral values the decoder is configured to: determine whether each of the number of spectral values is within the sub-set of spectral values; accumulate the second term by the set scaling factor when the decoder determines that the spectral value is not within the sub-set of spectral values; and accumulate the first term and the second term by a respective sub-set spectral value when the decoder determines that the spectral value is in the sub-set of spectral values.
11-12. (canceled)
13. The apparatus as claimed in claim 3, wherein each scaling factor comprises the first term normalised by the second term.
14-15. (canceled)
16. The apparatus as claimed in claim 1, wherein each scaling factor comprises the ratio of the first term to the second term, wherein the received encoded audio signal comprises individual coding layers and wherein the at least one scaling factor is an emphasis scaling factor.
17-18. (canceled)
19. A method comprising: receiving a first part of an encoded audio signal; determining at least one scaling factor based at least in part on the first part of the encoded audio signal; scaling the first part of the encoded audio signal based at least in part on the at least one scaling factor to produce a scaled encoded audio signal; and decoding the scaled encoded audio signal.
20. A method as claimed in claim 19, wherein the encoded audio signal comprises at least one set of spectral values, and the first part of the encoded audio signal comprises: at least one sub-set of spectral values, each sub-set of spectral values associated with one of the at least one set of spectral values; and at least one set scaling factor, each set scaling factor being associated with one of the at least one set of spectral values.
21. A method as claimed in claim 20, wherein each of the at least one scaling factor is associated with one of the at least one set of spectral values, wherein the scaling the first part of the encoded audio signal comprises scaling the sub-set of spectral values associated with one of the at least one set of spectral values by the respective scaling factor; and wherein determining at least one scaling factor comprises determining a first term dependent on the respective sub-set of spectral values and determining a second term dependent on the first term and the respective set scaling factor.
22. (canceled)
23. A method as claimed in claim 22, wherein determining the first term comprises determining the total spectral energy value of the respective sub-set of spectral values, and wherein determining the total spectral energy value of the respective sub-set of spectral values comprises at least one of: determining a combination of an absolute value of each spectral value of the respective sub-set of spectral values; and determining a combination of a squared value of each spectral value of the respective sub-set of spectral values.
24. (canceled)
25. A method as claimed in claim 23, wherein each set scaling factor comprises at least one of: the average energy per spectral value for the respective set of spectral values; the average energy per spectral value for all sets of spectral values.
26. A method as claimed in claim 22, wherein determining the second term comprises combining the first term and a product of the respective set scaling factor and a multiplier, and wherein the method further comprises determining the value of the multiplier by subtracting the number of spectral values in the respective sub-set of spectral values from the number of spectral values in the set of spectral values.
27. (canceled)
28. A method as claimed in claim 22, further comprising: determining a number of spectral values in a set of spectral values; and for each of the number of spectral values in a set of spectral values the method comprises determining whether the spectral value is within the sub-set of spectral values; accumulating the second term by the set scaling factor when the spectral value is determined to not be within the sub-set of spectral values; accumulating the first term and the second term by the respective sub-set spectral value when the spectral value is in the sub-set of spectral values.
29-30. (canceled)
31. A method as claimed in claim 22, wherein determining each scaling factor comprises normalising the first term by the second term.
32-33. (canceled)
34. A method as claimed in claim 19, wherein determining each scaling factor comprises the ratio of the first term to the second term, wherein the received encoded audio signal comprises individual coding layers, and wherein the at least one scaling factor is an emphasis scaling factor.
35-38. (canceled)
39. A computer program product in which a software code is stored in a computer readable medium, wherein said code realizes the following when being executed by a processor: receiving a first part of an encoded audio signal; determining at least one scaling factor based at least in part on the first part of the encoded audio signal; scaling the first part of the encoded audio signal based at least in part on the at least one scaling factor to produce a scaled encoded audio signal; and decoding the scaled encoded audio signal.
40. (canceled)